1 Performance issues of a distributed frame buffer on a multicomputer
Bin Wei, Douglas W. Clark, Edward W. Felten, Kai Li, Gordon Stoll 
August 1998 Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on 
Graphics hardware 

Full text available: pdf(1.63 MB) Additional Information: full citation, references, citings, index terms



Keywords: multi-port distributed frame buffer, multicomputer, parallel rendering, 
synchronization 



2 VC-1: a scalable graphics computer with virtual local frame buffers
Satoshi Nishimura, Tosiyasu L. Kunii 

August 1996 Proceedings of the 23rd annual conference on Computer graphics and 
interactive techniques 

Full text available: pdf(266.19 KB) Additional Information: full citation, references, index terms



Keywords: demand paging, frame buffers, parallel polygon rendering, scalable 



3 The design of a parallel graphics interface
Homan Igehy, Gordon Stoll, Pat Hanrahan 

July 1998 Proceedings of the 25th annual conference on Computer graphics and 
interactive techniques 

Full text available: pdf(389.52 KB) Additional Information: full citation, references, citings, index terms



4 InfiniteReality: a real-time graphics system

John S. Montrym, Daniel R. Baum, David L Dignam, Christopher J. Migdal 
August 1997 Proceedings of the 24th annual conference on Computer graphics and 
interactive techniques 

Full text available: pdf(697.27 KB) Additional Information: full citation, references, citings, index terms
5 Hardware accelerated rendering of antialiasing using a modified a-buffer algorithm
Stephanie Winner, Mike Kelley, Brent Pease, Bill Rivard, Alex Yen 

August 1997 Proceedings of the 24th annual conference on Computer graphics and 
interactive techniques 

Full text available: pdf(113.06 KB) Additional Information: full citation, references, citings, index terms



Keywords: antialiasing, image partitioning, plane equation evaluation, scanline, texture 
mapping, transparency 



6 Hybrid volume and polygon rendering with cube hardware 
Kevin Kreeger, Arie Kaufman 

July 1999 Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics 
hardware 

Full text available: pdf(1.85 MB) Additional Information: full citation, references, citings, index terms



Keywords: cube architecture, mixing polygons and volumes, ray casting,
run-length-encoding, volume rendering



7 Combatting rendering latency

Marc Olano, Jon Cohen, Mark Mine, Gary Bishop 

April 1995 Proceedings of the 1995 symposium on Interactive 3D graphics 

Additional Information: full citation, abstract, references, citings, index terms



Latency or lag in an interactive graphics system is the delay between user input and 
displayed output. We have found latency and the apparent bobbing and swimming of objects 
that it produces to be a serious problem for head-mounted display (HMD) and augmented 
reality applications. At UNC, we have been investigating a number of ways to reduce 
latency; we present two of these. Slats is an experimental rendering system for our
Pixel-Planes 5 graphics machine guaranteeing a constant single NTSC ...

8 Talisman: commodity realtime 3D graphics for the PC
Jay Torborg, James T. Kajiya 

August 1996 Proceedings of the 23rd annual conference on Computer graphics and 
interactive techniques 

Full text available: pdf(107.48 KB) Additional Information: full citation, references, citings, index terms



9 Dissertation Abstracts in Computer Graphics 

January 1992 ACM SIGGRAPH Computer Graphics, Volume 26 issue 1

Full text available: pdf(2.53 MB) Additional Information: full citation



10 Accelerated walkthrough of large spline models 

Subodh Kumar, Dinesh Manocha, Hansong Zhang, Kenneth E. Hoff 

April 1997 Proceedings of the 1997 symposium on Interactive 3D graphics 
11 Fast data parallel polygon rendering
F. A. Ortega, C. D. Hansen, J. P. Ahrens 

December 1993 Proceedings of the 1993 ACM/IEEE conference on Supercomputing 

Full text available: pdf(1.65 MB) Additional Information: full citation, references, citings, index terms



12 The rendering architecture of the DN10000VS 
David Kirk, Douglas Voorhies 

September 1990 ACM SIGGRAPH Computer Graphics, Proceedings of the 17th annual
conference on Computer graphics and interactive techniques, volume 24 issue 4

Full text available: pdf(4.07 MB) Additional Information: full citation, abstract, references, citings, index terms

The Apollo DN10000VS treats graphics as an integral part of the system architecture.
Graphics requirements influence the entire system design. All floating-point computations 
for graphics are performed by the CPU(s), while rasterizing is handled by simplified 
hardware having no microcode. We decided to support alpha buffering, quadratic 
interpolation, and texture mapping directly in hardware. This partitioning reduces the cost of 
a high-end workstation, without sacrificing high rendering qualit ... 

13 The triangle processor and normal vector shader: a VLSI system for high performance
graphics

Michael Deering, Stephanie Winner, Bic Schediwy, Chris Duffy, Neil Hunt 

June 1988 ACM SIGGRAPH Computer Graphics, Proceedings of the 15th annual
conference on Computer graphics and interactive techniques, volume 22 issue 4

Full text available: pdf(2.29 MB) Additional Information: full citation, abstract, references, citings, index terms

Current affordable architectures for high-speed display of shaded 3D objects operate orders 
of magnitude too slowly. Recent advances in floating point chip technology have outpaced 
polygon fill time, making the memory access bottleneck between the drawing processor and 
the frame buffer the most significant factor to be accelerated. Massively parallel VLSI 
systems have the potential to bypass this bottleneck, but to date only at very high cost. We
describe a new more affordable VLSI solution. A pi ... 

Keywords: graphics VLSI, hardware lighting models, interpolation, real-time image display, 
shading, triangle processor 



14 Scalable distributed visualization using off-the-shelf components 
Alan Heirich, Laurent Moll 

October 1999 Proceedings of the 1999 IEEE symposium on Parallel visualization and 
graphics 

Full text available: pdf(1.81 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes a visualization architecture for scalable computer systems. The 
architecture is currently being prototyped for use in Beowulf-class clustered systems. A set 
of OpenGL frame buffers are driven in parallel by a set of CPUs. The visualization 
architecture merges the contents of these frame buffers by user-programmable associative 
and commutative combining operations. The system hardware is built from off-the-shelf
components including OpenGL accelerators, Field Programmabl ... 
15 A scalable parallel cell-projection volume rendering algorithm for three-dimensional
unstructured data

Kwan-Liu Ma, Thomas W. Crockett 

October 1997 Proceedings of the IEEE symposium on Parallel rendering 

Full text available: pdf(1.67 MB) Additional Information: full citation, references, citings, index terms



Keywords: asynchronous communication, distributed memory, hierarchical data structures, 
load balancing, message passing, parallel algorithms, scientific visualization, unstructured 
grids, volume rendering 



16 Hardware acceleration for Window systems 
D. Rhoden, C. Wilcox 

July 1989 ACM SIGGRAPH Computer Graphics, Proceedings of the 16th annual
conference on Computer graphics and interactive techniques, volume 23 issue 3

Full text available: pdf(1.81 MB) Additional Information: full citation, abstract, references, citings, index terms

Graphics pipelines are quickly evolving to support multitasking workstations. The driving 
force behind this evolution is the window system, which must provide high performance 
graphics within multiple windows, while maintaining interactivity. The virtual graphics 
system presented by [7] provides a clean solution to the problem of context switching 
graphics hardware between processes, but does not solve all the problems associated with 
sharing graphics pipelines. The primary difficulty in context ...

17 PixelFlow: high-speed rendering using image composition 
Steven Molnar, John Eyles, John Poulton 

July 1992 ACM SIGGRAPH Computer Graphics, Proceedings of the 19th annual
conference on Computer graphics and interactive techniques, Volume 26 issue 2

Full text available: pdf(2.31 MB) Additional Information: full citation, references, citings, index terms



Keywords: antialiasing, compositing, deferred shading, rendering, scalable



18 A task adaptive parallel graphics renderer 
Scott Whitman 

November 1993 Proceedings of the 1993 symposium on Parallel rendering 

Full text available: pdf(1.15 MB) Additional Information: full citation, references, citings, index terms, review



19 Fast detection of communication patterns in distributed executions 
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies 
on Collaborative research 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based on 
process-time diagrams are often used to obtain a better understanding of the execution of
the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not provide 
the user with the desired overview of the application. In our experience, such tools display 
repeated occurrences of non-trivial commun ... 

20 A MIMD rendering algorithm for distributed memory architectures
Thomas W. Crockett, Tobias Orloff 

November 1993 Proceedings of the 1993 symposium on Parallel rendering 

Full text available: pdf(1.16 MB) Additional Information: full citation, references, citings, index terms



Keywords: asynchronous algorithms, multiprocessors, parallel polygon rendering, 
performance analysis 
1 Performance analysis and design of a logic simulation machine
K. Wong, M. A. Franklin 

June 1987 Proceedings of the 14th annual international symposium on Computer 
architecture 

Full text available: pdf(941.01 KB) Additional Information: full citation, abstract, references, citings, index terms



The high costs associated with logic simulation of large VLSI circuits has led to the need for 
new computer architectures tailored to the simulation task. Such architectures have the 
potential for significant speed-ups over software-based logic simulators executing on 
standard sequential computers. This paper presents a model of one class of multiprocessor 
simulation architectures and compares the performance of some of these machines using 
data obtained from simulations of VLSI circuits. I ... 

2 A hardware accelerator for speech recognition algorithms
T. S. Anantharaman, R. Bisiani 

June 1986 ACM SIGARCH Computer Architecture News, Proceedings of the 13th
annual international symposium on Computer architecture, volume 14 issue 2

Full text available: pdf(729.28 KB) Additional Information: full citation, abstract, references, citings, index terms



This paper describes two custom architectures tailored to a speech recognition beam search 
algorithm. Both architectures have been simulated using real data and the results of the 
simulation are presented. The paper also describes the design process of the custom 
architectures and presents a number of ideas on the automatic design of custom systems for 
data dependent computations. 

3 A hardware accelerator for maze routing
Y. Won, S. Sahni, Y. El-ziq

October 1987 Proceedings of the 24th ACM/IEEE conference on Design automation 

Full text available: pdf(871.73 KB) Additional Information: full citation, abstract, references, index terms

A hardware accelerator for the maze routing problem is developed. This accelerator consists 
of three 3 stage pipelines. Banked memory is used to avoid memory read/write conflicts and 
obtain maximum efficiency. 
4 Hardware acceleration for Window systems
D. Rhoden, C. Wilcox

July 1989 ACM SIGGRAPH Computer Graphics, Proceedings of the 16th annual
conference on Computer graphics and interactive techniques, Volume 23 issue 3

Full text available: pdf(1.81 MB) Additional Information: full citation, abstract, references, citings, index terms

Graphics pipelines are quickly evolving to support multitasking workstations. The driving 
force behind this evolution is the window system, which must provide high performance 
graphics within multiple windows, while maintaining interactivity. The virtual graphics 
system presented by [7] provides a clean solution to the problem of context switching 
graphics hardware between processes, but does not solve all the problems associated with 
sharing graphics pipelines. The primary difficulty in context ...

5 A vector hardware accelerator with circuit simulation emphasis 

A. Vladimirescu, D. Weiss, M. Katevenis, Z. Bronstein, A. Kifir, K. Danuwidjaja, K. C. Ng, N.
Jain, S. Lass

October 1987 Proceedings of the 24th ACM/IEEE conference on Design automation 

Full text available: pdf(591.83 KB) Additional Information: full citation, abstract, references, index terms

A floating-point vector accelerator has been built which runs circuit simulation efficiently. 
The design considerations of the accelerator are based on the time-consuming parts of 
SPICE2, available off-the-shelf parts, advanced software tools experience and 
cost/performance. The three board accelerator can run the entire application program 
compiled from a high-level language. A personal workstation, such as the PC-AT, is used for
the general I/O tasks such as file handling and n ... 

6 Architecture and design of the MARS hardware accelerator 

P. Agrawal, W. J. Dally, A. K. Ezzat, W. C. Fischer, H. V. Jagadish, A. S. Krishnakumar 
October 1987 Proceedings of the 24th ACM/IEEE conference on Design automation 

Full text available: pdf(1.49 MB) Additional Information: full citation, abstract, references, citings, index terms

MARS (Microprogrammable Accelerator for Rapid Simulations) is a multiprocessor based 
hardware accelerator capable of efficiently implementing a wide range of computationally 
complex algorithms. Its architecture is ideally suited for performing event driven simulations 
of VLSI circuits. The highly pipelined and parallel architecture of MARS provides a 
performance comparable to existing hardware simulation engines while its highly flexible 
architecture supports a wide range of applications. F ... 

7 Overview of a high-performance programmable pipeline structure 
François Bodin, François Charot, Charles Wagner

June 1986 Proceedings of the 3rd international conference on Supercomputing 

Full text available: pdf(2.05 MB) Additional Information: full citation, abstract, references, index terms

This paper aims at describing a high-performance programmable pipeline architecture 
consisting of a linear array of PCS processors. The PCS processor which is capable of 
performing 20 million floating-point operations per second (20 MFLOPS) has been built from 
off-the-shelf chips on a wire-wrapped board. The prototype processor is attached to a SUN-3
workstation. Efficient microcode is generated using the microcode compiler that has been 
designed and implemented. The microcode op ... 

8 Texture shaders
Michael D. McCool, Wolfgang Heidrich 

July 1999 Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics 
hardware 

Full text available: pdf(1.36 MB) Additional Information: full citation, references, citings, index terms
9 The Parallel Protocol Engine
Matthias Kaiserswerth 

December 1993 IEEE/ACM Transactions on Networking (TON), Volume 1 issue 6

Full text available: pdf(1.65 MB) Additional Information: full citation, references, citings, index terms, review



10 Continuous profiling: where have all the cycles gone? 

Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika R. Henzinger, 
Shun-Tak A. Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger, William E.
Weihl 

October 1997 ACM SIGOPS Operating Systems Review, Proceedings of the sixteenth
ACM symposium on Operating systems principles, Volume 31 issue 5

Full text available: pdf(2.29 MB) Additional Information: full citation, references, citings, index terms



11 Continuous profiling: where have all the cycles gone? 

Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika R. Henzinger, 
Shun-Tak A. Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger, William E. 
Weihl 

November 1997 ACM Transactions on Computer Systems (TOCS), Volume 15 issue 4 

Full text available: pdf(259.35 KB) Additional Information: full citation, abstract, references, citings, index terms

This article describes the Digital Continuous Profiling Infrastructure, a sampling-based 
profiling system designed to run continuously on production systems. The system supports 
multiprocessors, works on unmodified executables, and collects profiles for entire systems, 
including user programs, shared libraries, and the operating system kernel. Samples are 
collected at a high rate (over 5200 samples/sec. per 333MHz processor), yet with low 
overhead (1-3% slowdown for most workloads). A ... 

Keywords: performance understanding, performance-monitoring hardware, profiling, 
program analysis 



12 Hardware speedups in long integer multiplication
M. Shand, P. Bertin, J. Vuillemin 

May 1990 Proceedings of the second annual ACM symposium on Parallel algorithms 
and architectures 

Full text available: pdf(939.04 KB) Additional Information: full citation, references, citings, index terms



13 Parallel logic simulation of VLSI systems 

Mary L. Bailey, Jack V. Briner, Roger D. Chamberlain 

September 1994 ACM Computing Surveys (CSUR), Volume 26 issue 3 

Full text available: pdf(3.74 MB) Additional Information: full citation, abstract, references, citings, index terms

Fast, efficient logic simulators are an essential tool in modern VLSI system design. Logic
simulation is used extensively for design verification prior to fabrication, and as VLSI 
systems grow in size, the execution time required by simulation is becoming more and more 
significant. Faster logic simulators will have an appreciable economic impact, speeding time 
to market while ensuring more thorough system design testing. One approach to this 
problem is to utilize parallel processing, taking ... 

Keywords: circuit structure, parallel architecture, parallelism, partitioning, synchronization 
algorithm, timing granularity 



14 Reconfigurable technology: an innovative solution for parallel discrete event simulation
support

C. Beaumont, P. Boronat, J. Champeau, J.-M Filloque, B. Pottier 

July 1994 ACM SIGSIM Simulation Digest, Proceedings of the eighth workshop on
Parallel and distributed simulation, volume 24 issue 1

Additional Information: full citation, abstract, references, citings, index terms

Accelerating discrete event simulation can be achieved by using parallel architectures. The 
use of dedicated hardware is a possible alternative in some special domains like logic 
simulation. However, few studies have focused on general cases. This paper presents an 
innovative solution using a recent hardware technology called FPGA (Field Programmable 
Gate Array), that enables dynamic synthesis of application specific hardware. Each node of 
an MIMD parallel machin ... 

15 The M-Machine multicomputer 

Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny 
Gurevich, Whay S. Lee 

December 1995 Proceedings of the 28th annual international symposium on
Microarchitecture

Full text available: pdf(1.29 MB) Additional Information: full citation, references, citings, index terms



16 Hybrid volume and polygon rendering with cube hardware
Kevin Kreeger, Arie Kaufman 

July 1999 Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics 
hardware 

Full text available: pdf(1.85 MB) Additional Information: full citation, references, citings, index terms



Keywords: cube architecture, mixing polygons and volumes, ray casting,
run-length-encoding, volume rendering



17 The design of a parallel graphics interface 
Homan Igehy, Gordon Stoll, Pat Hanrahan 

July 1998 Proceedings of the 25th annual conference on Computer graphics and 
interactive techniques 

Full text available: pdf(389.52 KB) Additional Information: full citation, references, citings, index terms
Edward Gehringer, Janne Abullarade, Michael H. Gulyn

September 1988 ACM SIGARCH Computer Architecture News, Volume 16 issue 4 

Full text available: pdf(2.96 MB) Additional Information: full citation, abstract, citings, index terms

This paper compares eight commercial parallel processors along several dimensions. The 
processors include four shared-bus multiprocessors (the Encore Multimax, the Sequent 
Balance system, the Alliant FX series, and the ELXSI System 6400) and four network 
multiprocessors (the BBN Butterfly, the NCUBE, the Intel iPSC/2, and the FPS T Series). The 
paper contrasts the computers from the standpoint of interconnection structures, memory 
configurations, and interprocessor communication. Also, the share ... 

19 VC-1: a scalable graphics computer with virtual local frame buffers
Satoshi Nishimura, Tosiyasu L. Kunii 

August 1996 Proceedings of the 23rd annual conference on Computer graphics and 
interactive techniques 

Full text available: pdf(266.19 KB) Additional Information: full citation, references, index terms



Keywords: demand paging, frame buffers, parallel polygon rendering, scalable 



20 Scalable distributed visualization using off-the-shelf components 
Alan Heirich, Laurent Moll 

October 1999 Proceedings of the 1999 IEEE symposium on Parallel visualization and 
graphics 

Full text available: pdf(1.81 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes a visualization architecture for scalable computer systems. The 
architecture is currently being prototyped for use in Beowulf-class clustered systems. A set 
of OpenGL frame buffers are driven in parallel by a set of CPUs. The visualization 
architecture merges the contents of these frame buffers by user-programmable associative 
and commutative combining operations. The system hardware is built from off-the-shelf
components including OpenGL accelerators, Field Programmabl ... 

Keywords: Beowulf, FPGA, OpenGL, cluster, fat-tree, gigabit, visualization 